ABSTRACT

Social networks often encode community structure using multiple
distinct types of links between nodes. In this paper we
introduce a novel method to extract information from such
multi-layer networks, where each type of link forms its own
layer. Using the concept of Pareto optimality, community detection
in this multi-layer setting is formulated as a multiple
criterion optimization problem. We propose an algorithm for
finding an approximate Pareto frontier containing a family of
solutions. The power of this approach is demonstrated on a
Twitter dataset, where the nodes are hashtags and the layers
correspond to (1) behavioral edges connecting pairs of hashtags
whose temporal profiles are similar and (2) relational
edges connecting pairs of hashtags that appear in the same
tweets.

Index Terms: Community detection, multi-layer networks,
Twitter

1. INTRODUCTION 

Social networks have become rich sources of data for network
analysis, where objectives might include community detection,
edge prediction, node behavior prediction, and model
inference. However, it has become increasingly difficult to
extract meaningful information from these networks due to
the explosion in both the volume of data collected and the
diversity of available data types. In this paper we focus on
addressing the latter problem for the task of community detection;
specifically, we consider networks containing multiple
layers of interactions between nodes.

For many social network applications, measures of association
between pairs of nodes may be available along multiple
dimensions. For example, graph edges may be observed directly
in the data, or they may be inferred from actions of the
agents in the network. We make the distinction between relational
links that are observed explicitly and behavioral links
that are inferred from ancillary data describing node behavior.
Examples of relational links between users might include
observed interactions over a period of time, mutually established
friendship connections, or email sender-receiver relationships.
Likewise, behavioral links might be drawn between
users who post items with similar semantic content, like the
same bands or movies, or exhibit correlated activity over time.
Further, it is possible to have multiple types of relational and
behavioral links; for instance, there could be both a professional
and a personal social network over the same set of users.
Networks with multiple distinct edge types have been called
multi-layer, multi-level, multi-relational, or multiplex
networks.

In a multi-layer network, each layer may have a unique
topology. The simplest way to apply existing network analysis
algorithms (which generally assume homogeneous edges)
is to "flatten" the data, i.e., to combine all the different types
of links into a single-layer network. This can be accomplished
in various ways, for instance, by performing a logical AND or
OR on the layer-specific adjacency matrices, or by computing
their weighted (and possibly thresholded) average. However,
this approach has many hidden pitfalls; for example, if one
of the layers is noisier than the others, then it probably should
not receive equal consideration when attempting community
detection.
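
To make the flattening options concrete, the following is a minimal sketch, assuming each layer is given as a symmetric 0/1 adjacency matrix stored as a list of lists. The function name `flatten`, its parameters, and the toy matrices are illustrative and do not come from the paper.

```python
def flatten(layers, mode="or", weights=None, threshold=0.5):
    """Combine layer adjacency matrices (lists of 0/1 rows) into one matrix.

    mode="and": keep an edge only if it is present in every layer.
    mode="or": keep an edge if it is present in at least one layer.
    mode="average": keep an edge if the weighted average reaches `threshold`.
    """
    n = len(layers[0])
    if weights is None:
        # Hypothetical default: every layer gets equal weight.
        weights = [1.0 / len(layers)] * len(layers)
    flat = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            entries = [layer[i][j] for layer in layers]
            if mode == "and":
                flat[i][j] = int(all(entries))
            elif mode == "or":
                flat[i][j] = int(any(entries))
            elif mode == "average":
                avg = sum(w * e for w, e in zip(weights, entries))
                flat[i][j] = int(avg >= threshold)
            else:
                raise ValueError(f"unknown mode: {mode}")
    return flat


# Two toy layers on three nodes.
A1 = [[0, 1, 0],
      [1, 0, 1],
      [0, 1, 0]]
A2 = [[0, 1, 1],
      [1, 0, 0],
      [1, 0, 0]]

flat_and = flatten([A1, A2], mode="and")
flat_or = flatten([A1, A2], mode="or")
# Down-weighting the second layer makes the result follow the first layer.
flat_avg = flatten([A1, A2], mode="average", weights=[0.9, 0.1])
```

The weighted variant shows where the pitfall arises: the outcome depends entirely on weights that have to be chosen before anything is known about how noisy each layer is.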

A better strategy, we argue, is to directly analyze the multi-layer
networks without flattening. To show how this can be
done, we propose a new method of community detection for
multi-layer networks. Our approach employs multi-objective
optimization, taking into account multiple layers of network
structure, which is then used to find a community partition.
We show that this algorithm can provide significantly better
community detection than that obtained by standard single-layer
techniques.

The paper proceeds as follows. In Sec. 2 we define multi-layer
networks. In Sec. 3 a Pareto optimality approach to
multi-layer community detection is proposed, and in Sec. 4
we apply the proposed approach to Twitter data. Finally, we
discuss related work in Sec. 5 and give concluding remarks
in Sec. 6.

2. MULTI-LAYER NETWORKS 

A multi-layer network $G = (V, \mathcal{E})$ consists of vertices
$V = \{v_1, \ldots, v_p\}$, common to all layers, and edges
$\mathcal{E} = (\mathcal{E}_1, \ldots, \mathcal{E}_M)$ in $M$ layers,
where $\mathcal{E}_k$ is the edge set for layer $k$, and
$\mathcal{E}_k = \{e^k_{v_i v_j} ;\ v_i, v_j \in V\}$. Each edge is
undirected, though extensions to the directed case are not difficult.
The multi-layer degree of a node $i$ is $d^i \in \mathbb{R}^M$, with
each entry $[d^i]_k$ being the degree of node $i$ on layer $k$.

The adjacency matrix and degree matrix are defined as
usual for each layer:

$[[A^k]]_{ij} = e^k_{v_i v_j}, \qquad D^k = \mathrm{diag}([d^1]_k, [d^2]_k, \ldots, [d^p]_k)$ (1)

Note that $D^k$ is simply a $p \times p$ diagonal matrix with the
layer-specific node degrees on the diagonal.
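
As an illustration of these definitions, the sketch below builds per-layer adjacency matrices from undirected edge lists and derives the multi-layer degree of each node and the per-layer degree matrix. The helper names and the four-node example are assumptions made for illustration only.

```python
def adjacency(p, edges):
    """Symmetric 0/1 adjacency matrix for p nodes and undirected edges."""
    A = [[0] * p for _ in range(p)]
    for i, j in edges:
        A[i][j] = 1
        A[j][i] = 1
    return A


def multilayer_degrees(layer_adjs):
    """For each node, the vector of its degrees across the layers."""
    p = len(layer_adjs[0])
    return [[sum(A[i]) for A in layer_adjs] for i in range(p)]


def degree_matrix(A):
    """Diagonal matrix holding the node degrees of a single layer."""
    p = len(A)
    return [[sum(A[i]) if i == j else 0 for j in range(p)] for i in range(p)]


# Toy network: 4 nodes, 2 layers (nodes are 0-indexed here).
layer1 = adjacency(4, [(0, 1), (1, 2), (2, 3)])  # a path
layer2 = adjacency(4, [(0, 2), (0, 3)])          # a small star around node 0

degrees = multilayer_degrees([layer1, layer2])
D2 = degree_matrix(layer2)
```

Here `degrees[i]` plays the role of the multi-layer degree vector of node i, and `D2` is the degree matrix of the second layer.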

3. COMMUNITY DETECTION VIA 
MULTIOBJECTIVE OPTIMIZATION 

Many existing community detection algorithms involve optimization.
Methods that fall into this category include spectral
algorithms, modularity methods, and methods that rely on
statistical inference, particularly those that try to maximize a
likelihood function. It seems natural that a multi-layer generalization
of such algorithms might somehow combine the
optimization objective functions as applied to each individual
layer; this is the basis of multi-objective optimization.

More formally, let community structure in a network be
described by a node partition $C$, where $C(i) = k$ means that
node $i$ is in part $k$. Single-objective optimization methods of
community detection seek to find the partition $\arg\min_C f(C)$
that minimizes an objective function $f$ (which depends internally
on the network structure). In the following we consider
the two-community case; more communities can be found by
a recursive use of the algorithm.

Now consider a two-layer network, and let $f_1$ and $f_2$ be
objective functions for the two layers. One obvious way of
combining the layers would be to minimize the linear combination
$\alpha f_1(C) + (1 - \alpha) f_2(C)$ over $C$, where $\alpha \in [0, 1]$.
However, linear combination may be restrictive, especially
when the objective functions are complex. A more general
approach is instead to seek the Pareto optimal solutions of the
multi-objective minimization problem:

$\hat{C} = \arg\min_C \, [f_1(C), f_2(C)]$. (2)

A solution to the multi-objective optimization problem (2) is
said to be weakly Pareto optimal (or weakly non-dominated)
if it is not possible to decrease any objective function without
increasing some other objective function. More formally,
a solution $C_1$ dominates a solution $C_2$ if
$f_i(C_1) \le f_i(C_2)$ for every objective function $f_i$ and there
exists some $j$ such that $f_j(C_1) < f_j(C_2)$. The first Pareto
front is the set of weakly non-dominated points.
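
The dominance relation can be checked directly on objective vectors. The following is a small sketch, assuming each candidate solution has already been evaluated to a tuple of objective values to be minimized; the function names and the sample points are hypothetical.

```python
def dominates(a, b):
    """True if vector a is no worse than b in every objective and strictly
    better in at least one (all objectives are minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def first_front(points):
    """Keep the points that no other point in the collection dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Five candidate solutions scored on two objectives (f1, f2).
points = [(1.0, 5.0), (2.0, 3.0), (3.0, 4.0), (4.0, 1.0), (2.0, 3.0)]
front = first_front(points)
```

In this example the point (3.0, 4.0) is dropped because (2.0, 3.0) is better in both objectives, while the duplicated point survives twice since equal vectors do not dominate each other. This brute-force filter compares every pair of points, which is adequate for small candidate sets.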

Calculating an exact Pareto front is, in general, a challenging
task. The most popular approximate methods are genetic
algorithms, which employ biologically inspired heuristics to
attempt to transform randomly selected seed cases into solutions
on the Pareto front using propagation. One disadvantage
of genetic approaches is that they are not deterministic.


Input: $f_1$, $f_2$
Obtain optimum solutions $C_1^*$, $C_2^*$ for each layer
Initialize $C = C_1^*$
repeat
    for $i : C(i) \ne C_2^*(i)$ do
        $C^{new} \leftarrow C$ with $C^{new}(i) = C_2^*(i)$
        cost($i$) $\leftarrow f_2(C^{new}) - f_2(C)$
    end for
    $i^* \leftarrow \arg\min_i$ cost($i$)
    $C(i^*) \leftarrow C_2^*(i^*)$
until $C = C_2^*$
Output: non-dominated solution values taken by $C$

Fig. 1. Proposed algorithm for Pareto front identification.

Additionally, there is no guarantee that any of the Pareto front
will be correctly identified. Finally, most genetic algorithms
deal with real-valued decision variables, while the community
detection problem has a discrete decision space.

The alternative strategy employed in this paper is based
on the Kernighan-Lin node swapping technique. The objective
is to find solutions that are approximately Pareto optimal.
If it is possible to obtain a sample of solutions that are
likely to be on or near the front, these points can be sorted for
non-domination very quickly. In this way, a large set of solutions
is filtered to find candidates that are potentially Pareto
optimal and worth further consideration. Figure 1 shows the
proposed algorithm.
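
One possible reading of the node-swapping procedure is sketched below: start from the optimum of the first layer, repeatedly move the single disagreeing node whose switch changes the second objective most favorably, and finally keep the visited partitions whose objective values are non-dominated. This is an interpretation under stated assumptions, not the authors' implementation. The toy objectives (Hamming distance to a fixed target partition) are hypothetical stand-ins for the per-layer objectives, and the two optima are assumed to use consistent labels.

```python
def dominates(a, b):
    """Standard dominance for minimization of objective vectors."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_path(f1, f2, c1_opt, c2_opt):
    """Walk from c1_opt to c2_opt one node at a time, greedily on f2,
    and return the visited (partition, objective values) pairs that are
    non-dominated among the visited ones."""
    c = list(c1_opt)
    target = list(c2_opt)
    visited = [tuple(c)]
    while c != target:
        best_i, best_cost = None, None
        for i in range(len(c)):
            if c[i] == target[i]:
                continue  # only nodes that still disagree may move
            c_new = list(c)
            c_new[i] = target[i]
            cost = f2(c_new) - f2(c)
            if best_cost is None or cost < best_cost:
                best_i, best_cost = i, cost
        c[best_i] = target[best_i]
        visited.append(tuple(c))
    values = [(f1(v), f2(v)) for v in visited]
    return [(v, val) for v, val in zip(visited, values)
            if not any(dominates(other, val) for other in values)]


def hamming(a, b):
    """Number of nodes assigned to different parts."""
    return sum(1 for x, y in zip(a, b) if x != y)


# Hypothetical per-layer optima on six nodes.
t1 = [1, 1, 1, 2, 2, 2]
t2 = [1, 1, 2, 2, 2, 1]
path = pareto_path(lambda c: hamming(c, t1), lambda c: hamming(c, t2), t1, t2)
```

With these toy objectives every step trades one unit of the first objective for one unit of the second, so all three visited partitions are kept. The procedure is deterministic, and the number of objective evaluations grows at most quadratically with the number of nodes on which the two optima disagree.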

For community detection, the objective is to minimize the
ratio-cut $f_k$ for each layer $k = 1, 2$, which is built on the cut
of the partition on that layer:

$\mathrm{cut}(C) = \sum_{C(i) = 1,\, C(j) = 2} [[A^k]]_{ij}$ (4)

A relaxed version of this objective function can be solved by
performing an eigendecomposition on the Laplacian
$L^k = D^k - A^k$. More details can be found in [10].
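
The sketch below evaluates the cut of a two-part partition on one layer and checks numerically how it relates to the graph Laplacian (degree matrix minus adjacency matrix). The ratio-cut formula used here, the cut multiplied by the sum of the reciprocal part sizes, is the standard definition from the spectral clustering literature and is an assumption, since the exact expression is not reproduced in the text above. All function names are illustrative.

```python
def adjacency(p, edges):
    """Symmetric 0/1 adjacency matrix for p nodes and undirected edges."""
    A = [[0] * p for _ in range(p)]
    for i, j in edges:
        A[i][j] = A[j][i] = 1
    return A


def cut(A, C):
    """Total edge weight between part 1 and part 2 of partition C."""
    n = len(A)
    return sum(A[i][j] for i in range(n) for j in range(n)
               if C[i] == 1 and C[j] == 2)


def ratio_cut(A, C):
    """Assumed standard ratio cut: cut * (1/|part 1| + 1/|part 2|)."""
    n1 = sum(1 for c in C if c == 1)
    n2 = len(C) - n1
    if n1 == 0 or n2 == 0:
        return float("inf")  # a one-sided split is not a valid bisection
    return cut(A, C) * (1.0 / n1 + 1.0 / n2)


def laplacian(A):
    """Graph Laplacian: degree matrix minus adjacency matrix."""
    n = len(A)
    return [[(sum(A[i]) if i == j else 0) - A[i][j] for j in range(n)]
            for i in range(n)]


def quadratic_form(L, x):
    """Compute x^T L x."""
    n = len(L)
    return sum(x[i] * L[i][j] * x[j] for i in range(n) for j in range(n))


# Two triangles joined by a single edge (2, 3).
A = adjacency(6, [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)])
C = [1, 1, 1, 2, 2, 2]
x = [1 if c == 1 else -1 for c in C]  # +1 / -1 indicator of the two parts

cut_value = cut(A, C)
rc_value = ratio_cut(A, C)
qf_value = quadratic_form(laplacian(A), x)
```

For a +1/-1 indicator vector the quadratic form equals four times the cut, because each cut edge contributes (1 - (-1))^2 = 4 and each uncut edge contributes 0. This identity is what allows the discrete problem to be relaxed to an eigenvector computation on the Laplacian.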

4. TWITTER DATASET 

The proposed algorithm was applied to a month of data from
Twitter. A two-layer network on hashtags was developed using
tweets from October 2012. The data was obtained from
the Twitter stream API at gardenhose level access, which corresponds
to 10% of all tweets over the month. A list of hashtags
and the users who tweeted them was created for each day,
as well as the volume (i.e., number of observed occurrences)
of each hashtag per day.

Hashtags that were directly connected with the presidential
election or politics were chosen out of a list of the most
popular hashtags for the month, which yielded 48 hashtags.



(a) Hashtag Volume Layer (b) Hashtag User Layer

Fig. 2. A network visualization of two layers of the hashtag
dataset for October 10th, 2012. This example shows the differing
topologies generated by different links in a network.
While we see some similarities (for instance, nodes 38, 39,
and 32 have high degree centralities in both networks), these
networks have many differences, the most obvious being that
the volume layer is not even fully connected, while the user
layer is fully connected and has a diameter of only 6.

Figure 2 shows an example of two network layers for one day
on the original set of 48 hashtags. In order to include some
higher order connections, the list was expanded by including
hashtags whose volume per day behaved similarly over the
month to the first 48; this grew the network to 515 tags.

Initially, the total volume of the hashtags was studied over
time, and real events were compared with the profile; this is
shown in Figure 3. Some events are correlated with volume;
Hurricane Sandy falls on the two-day period with the largest
hashtag volume. The second presidential debate also corresponds
to a spike in hashtag volume. In contrast, the first
presidential debate is not an identifiable event in the volume
plot.

A time series of two-layer networks was created with
hashtags as the nodes. Specifically, 31 two-layer networks
were created by aggregating tweet data over each day
in the month. The first layer linked two hashtags if any user
used both hashtags on that particular day. This layer is
referred to as the hashtag user layer. The second layer linked
two hashtags if they had similar volume profiles over time.
Intuitively, two hashtags would have a link with each other
if they were popular or unpopular at the same time. So as
not to take into account too much past data, the volume correlation
was calculated using a moving window of 5 days. A
Pearson correlation coefficient was used to calculate the correlations
in volume for each pair of hashtags; the correlations
then underwent a Fisher transformation and were thresholded
by a value of 1.3859, which corresponds to an approximate
5% false positive rate (in the bivariate normal case) when
testing for the presence of a positive correlation [11]. This
layer is referred to as the hashtag volume layer. Figure 4
demonstrates pictorially the creation of the two layers, using
a simple dataset of three hashtags.
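
The construction of the two layers can be sketched as follows, assuming daily volumes are available as 5-element lists (one moving window) and the per-day usage is a mapping from users to the hashtags they used. The function names and sample data are hypothetical. As a side observation that is an inference rather than a statement from the paper, the threshold 1.3859 matches 1.96 divided by the square root of n - 3 for a window of n = 5 days, which is the usual standard error of a Fisher-transformed correlation.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # constant series: treat as uncorrelated (an assumption)
    return sxy / math.sqrt(sxx * syy)


def volume_link(x, y, threshold=1.3859):
    """Link two hashtags if the Fisher-transformed correlation of their
    windowed volumes exceeds the threshold."""
    r = pearson(x, y)
    r = max(min(r, 0.999999), -0.999999)  # keep atanh finite
    return math.atanh(r) > threshold


def user_layer_edges(tags_by_user):
    """Link two hashtags if at least one user used both of them."""
    edges = set()
    for tags in tags_by_user.values():
        for a, b in combinations(sorted(set(tags)), 2):
            edges.add((a, b))
    return edges


# One 5-day window of daily volumes for three hypothetical hashtags.
x = [10, 20, 30, 40, 50]
y = [12, 19, 33, 41, 48]   # rises together with x
z = [50, 10, 40, 20, 30]   # unrelated pattern

link_xy = volume_link(x, y)
link_xz = volume_link(x, z)

day_usage = {
    "u1": ["#debate", "#election"],
    "u2": ["#election", "#sandy", "#debate"],
    "u3": ["#sandy"],
}
user_edges = user_layer_edges(day_usage)
```

With only five points per window the transformed correlation is noisy, which is why the threshold corresponds to a fairly high raw correlation (about 0.88).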

Fig. 3. Volume of observed usage of the 515 political hashtags
along with an event timeline for October 2012. Notice
that while we can see that some events correlate with hashtag
usage for our dataset, this is not true for all events that might
be expected to affect political hashtags.

We will show that one is able to obtain more information
by the proposed Pareto multi-layer analysis methods than
when the two layers are analyzed separately. To this end, the
graph-cut partitions were computed for each day. We also
computed approximately Pareto-optimal partitions by combining
the single-layer solutions using the algorithm in Fig. 1 and
selected a single partition by using the approximate midpoint of
the Pareto front. The Adjusted Rand Index (ARI) was
then used to compare partitions on different days and see how
hashtag relationships change over time. The ARI measures
how similar partitions are, and can vary between -1 and 1.
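
For reference, the Adjusted Rand Index between two partitions of the same node set can be computed from pair counts, as in the sketch below. This follows the standard Hubert-Arabie formula; the function name and the example partitions are illustrative.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(a, b):
    """Adjusted Rand Index between two labelings of the same items."""
    n = len(a)
    if n < 2:
        return 1.0  # nothing to compare; treated as perfect agreement
    pair_counts = Counter(zip(a, b))   # contingency table entries
    a_counts = Counter(a)              # row sums
    b_counts = Counter(b)              # column sums
    sum_pairs = sum(comb(c, 2) for c in pair_counts.values())
    sum_a = sum(comb(c, 2) for c in a_counts.values())
    sum_b = sum(comb(c, 2) for c in b_counts.values())
    expected = sum_a * sum_b / comb(n, 2)
    max_index = 0.5 * (sum_a + sum_b)
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both labelings are trivial
    return (sum_pairs - expected) / (max_index - expected)


# Partitions of six hashtags on three hypothetical days.
day1 = [1, 1, 1, 2, 2, 2]
day2 = [2, 2, 2, 1, 1, 1]   # same grouping, labels swapped
day3 = [1, 2, 1, 2, 1, 2]   # unrelated grouping

ari_same = adjusted_rand_index(day1, day2)
ari_diff = adjusted_rand_index(day1, day3)
```

Because the index depends only on which pairs of nodes are grouped together, it is unaffected by swapping the two community labels between days. Values near zero or below indicate agreement no better than chance.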

Figure 5 shows heat maps of all the ARI values, both
for the single layers considered separately as well as for the
proposed algorithm. The hashtag user layer reflects fairly stable
correlation among the two clusters until day 16, where
there is a phase transition. Note that this phase transition also
occurs on the volume layer heatmap. There is not much similarity
between days in the user network, implying that there
is not an optimal stable two-cluster solution when considering
the hashtag user layer alone, and it is difficult to extract real
events.

In the hashtag volume layer heatmap, the community
structures on some days are highly correlated with each other. In
particular, the days on which Hurricane Sandy occurs have
communities that are highly correlated. It is also interesting
to note that the communities at the end of the month are nothing
like the bisected communities at the beginning, which implies
considerable temporal evolution in the network. There
is also more sparsity in the hashtag volume layer heatmap;
consequently it may be possible to detect events more easily
using this network.

The evident block structure in the Pareto combined
heatmap shows that the multi-layer algorithm eliminates
similarities between the first and second halves of the month.
The Pareto combined solution holds attributes from both the
hashtag volume layer and hashtag user layer; the structural
patterns that were present in the latter half of the month of
the hashtag volume network are also present in the combined
solution. The first half of the month also has some self-similarity,
which is seen in the hashtag user layer. However,
the proposed multi-layer algorithm was able to pick out some
days that were more highly correlated than in either of the
single-layer solutions. In particular, days 3-5 are more highly
correlated in the combined solution; October 3rd was the day
of the first debate. Interestingly, the layers jointly reveal correlations
between days not visible in the independent single-layer
analyses.

Fig. 4. The two layers of the Twitter hashtag network are illustrated.
At the top is the relational layer where a link between
two hashtags indicates that at least one user used both hashtags
in the same Tweet. At the bottom is the behavioral layer
where a link indicates similarity in the hashtag usage volume
over time.

Fig. 5. The more highly resolved block structure in the combined
network heatmap clearly indicates that the hashtag community
structure remains quite stable and coherent over the first 15 days
of October but then breaks up into smaller clusters of coherency
over the remainder of the month. This may reflect the change of
public opinions after the second Presidential debate (October
16) and the effect of Hurricane Sandy (October 28) on Twitter
hashtag volume and usage.


5. RELATED WORK 

With the advent of large data, there has been more opportunity
to explore this multi-layer structure. There has been some
work in the modeling and representation of multi-layer networks,
and how it relates to other studied problems [3].
While there is a large body of work in single-layer community
detection, the multi-layer community detection literature
is less comprehensive. Hypergraphs have been studied
from a spectral perspective [14], which can be useful when
dealing with a multi-layer structure. Some work in applying
single-layer modularity methods to multi-layer structures is
also available.

Multi-objective optimization has a long history. Here,
we are only interested in a sorting algorithm used to find
points that are possibly Pareto optimal; this is called non-dominated
sorting. The method used in this paper is taken from
an evolutionary algorithm. Some interesting
application work has been done using multi-objective
optimization, including supervised and unsupervised
learning.


6. CONCLUSION 

Multi-level network analysis is of growing interest as we are
faced with increasingly complex data. In this paper, a method
was introduced for finding communities in a multi-layer structure;
it was demonstrated on a Twitter hashtag dataset and
shown to deliver results that significantly differ from single-layer
analysis alone. The framework described can also be
applied to other single-layer algorithms for the multi-layer setting.